MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild
Authors
Abstract
This paper addresses the problem of 3D human pose estimation in the wild. A significant challenge is the lack of training data, i.e., 2D images of humans annotated with 3D poses. Such data is necessary to train state-of-the-art CNN architectures. Here, we propose a solution to generate a large set of photorealistic synthetic images of humans with 3D pose annotations. We introduce an image-based synthesis engine that artificially augments a dataset of real images with 2D human pose annotations using 3D Motion Capture (MoCap) data. Given a candidate 3D pose, our algorithm selects for each joint an image whose 2D pose locally matches the projected 3D pose. The selected images are then combined to generate a new synthetic image by stitching local image patches in a kinematically constrained manner. The resulting images are used to train an end-to-end CNN for full-body 3D pose estimation. We cluster the training data into a large number of pose classes and tackle pose estimation as a K-way classification problem. Such an approach is viable only with large training sets such as ours. Our method outperforms the state of the art in terms of 3D pose estimation in controlled environments (Human3.6M) and shows promising results for in-the-wild images (LSP). This demonstrates that CNNs trained on artificial images generalize well to real images.
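The per-joint image selection and the K-way pose classes described in the abstract can be illustrated with a small sketch. The abstract does not specify the camera model, the local matching score, or the clustering procedure, so everything below is an assumption made for illustration: an orthographic projection, a Euclidean distance over a joint and its kinematic neighbours, and nearest-centroid class labels with centroids supplied by the caller (for example from k-means). All function names are hypothetical, and the patch-stitching step is not covered.

```python
import numpy as np


def project_orthographic(pose_3d):
    """Project (J, 3) joints to 2D by dropping depth (assumed camera model)."""
    return pose_3d[:, :2]


def local_pose_distance(target_2d, candidate_2d, joint, neighbors):
    """Distance between two 2D poses around `joint`.

    Both poses are centred on `joint`, so only the local limb
    configuration matters, not the absolute image position.
    """
    idx = [joint] + list(neighbors[joint])
    a = target_2d[idx] - target_2d[joint]
    b = candidate_2d[idx] - candidate_2d[joint]
    return float(np.linalg.norm(a - b))


def select_image_per_joint(pose_3d, dataset_2d, neighbors):
    """For each joint, pick the dataset image whose 2D pose matches best locally.

    pose_3d:    (J, 3) candidate MoCap pose
    dataset_2d: (N, J, 2) 2D pose annotations of N real images
    neighbors:  dict mapping a joint index to its kinematic neighbours
    Returns a list of J image indices.
    """
    target_2d = project_orthographic(pose_3d)
    picks = []
    for j in range(target_2d.shape[0]):
        dists = [local_pose_distance(target_2d, cand, j, neighbors)
                 for cand in dataset_2d]
        picks.append(int(np.argmin(dists)))
    return picks


def assign_pose_classes(poses_3d, centroids):
    """Label each (J, 3) pose with the index of its nearest centroid.

    The resulting integer labels are the targets of a K-way classifier,
    where K is the number of centroids.
    """
    flat = poses_3d.reshape(len(poses_3d), -1)
    cent = centroids.reshape(len(centroids), -1)
    dists = np.linalg.norm(flat[:, None, :] - cent[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

In this sketch a different source image may be chosen for every joint, which is what makes the subsequent stitching of local patches necessary. Centring each pose on the joint being matched is one simple way to make the comparison local; a learned or scale-normalised score would be an equally plausible choice.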
Similar resources
Human Pose Estimation
Human pose estimation is one of the key problems in computer vision that has been studied for well over 15 years. The reason for its importance is the abundance of applications that can benefit from such a technology. For example, human pose estimation allows for higher level reasoning in the context of human-computer interaction and activity recognition; it is also one of the basic building blo...
Cascaded 3D Full-body Pose Regression from Single Depth Image at 100 FPS
Real-time live applications in virtual reality are increasingly common, and capturing and retargeting 3D human pose plays an important role in them. This paper presents a novel cascaded 3D full-body pose regression method to estimate accurate pose from a single depth image at 100 fps. The key idea is to train cascaded regressors based on the Gradient Boosting algorithm from pre-recorded human motion capt...
Monocular 3D Human Pose Estimation In The Wild Using Improved CNN Supervision
We propose a CNN-based approach for 3D human body pose estimation from single RGB images that addresses the issue of limited generalizability of models trained solely on the starkly limited publicly available 3D pose data. We propose novel CNN supervision techniques, using a regularization structure while training that extends the concept of multi-level skip connections, and leverage first and...
Mo²Cap²: Real-time Mobile 3D Motion Capture with a Cap-mounted Fisheye Camera
We propose the first real-time approach for the egocentric estimation of 3D human body pose in a wide range of unconstrained everyday activities. This setting has a unique set of challenges, such as mobility of the hardware setup, and robustness to long capture sessions with fast recovery from tracking failures. We tackle these challenges based on a novel lightweight setup that converts a stand...
Patient MoCap: Human Pose Estimation Under Blanket Occlusion for Hospital Monitoring Applications
Motion analysis is typically used for a range of diagnostic procedures in the hospital. While automatic pose estimation from RGB-D input has entered the hospital in the domain of rehabilitation medicine and gait analysis, no such method is available for bed-ridden patients. However, patient pose estimation in the bed is required in several fields such as sleep laboratories, epilepsy monitoring ...